Smart Paper Notes

A Deep Learning Based Multitask Network for Respiration Rate Estimation -- A Practical Perspective

Kapil Singh Rathore , Sricharan Vijayarangan , Preejith SP , Mohanasankar Sivaprakasam

Category: Machine Learning

2021-12-13

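The exponential rise in wearable sensors has generated significant interest in assessing physiological parameters during day-to-day activities. Respiration rate is one of the vital parameters used in the performance assessment of lifestyle activities. However, obtrusive measurement setups, motion artifacts, and other noise complicate the process. This paper presents a deep learning (DL) based multitask architecture for estimating instantaneous and average respiration rate from ECG and accelerometer signals, so that it performs effectively during daily living activities such as cycling and walking. The multitask network combines an Encoder-Decoder and an Encoder-IncResNet to obtain the average respiration rate and the respiration signal. The respiration signal can then be used to obtain breathing peaks and instantaneous breathing cycles. Mean absolute error (MAE), root mean square error (RMSE), inference time, and parameter count are used to compare the network with the current state-of-the-art machine learning (ML) model and with DL models developed in other studies. Other DL configurations based on a variety of inputs were also developed as part of this work. The proposed model showed better overall accuracy and gave better results than the individual modalities during different activities.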
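
The two error metrics named in this abstract can be illustrated with a minimal sketch. The breathing-rate values below are hypothetical numbers chosen for illustration, not data from the paper, and the function names are assumptions.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error between reference and estimated values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean square error between reference and estimated values."""
    squared = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(squared / len(y_true))


# Hypothetical respiration rates in breaths per minute (illustrative only).
reference = [12.0, 15.0, 18.0, 20.0]
estimated = [13.0, 15.0, 16.0, 21.0]

print(f"MAE:  {mae(reference, estimated):.3f}")
print(f"RMSE: {rmse(reference, estimated):.3f}")
```

RMSE penalizes the single 2-breath error more heavily than MAE does, which is why the two are usually reported together.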

translated by Google Translate

Lexicon and Attention based Handwritten Text Recognition System

Lalita Kumari , Sukhdeep Singh , VVS Rathore , Anuj Sharma

Category: Computer Vision

2022-09-11

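Handwritten text recognition, a sub-domain of pattern recognition, is widely studied by researchers in the computer vision community because of its scope for improvement and its applicability to daily life. With the growth in computational power over the last few decades, neural network based systems have contributed heavily to state-of-the-art handwritten text recognizers. In the same direction, we took two state-of-the-art neural network systems and merged the attention mechanism into them. The attention technique has been widely used in neural machine translation and automatic speech recognition and is now being applied in the text recognition domain. In this study, we achieve a 4.15% character error rate and a 9.72% word error rate on the IAM dataset, and a 7.07% character error rate and a 16.14% word error rate on the GW dataset, after merging attention into the existing Flor et al. architecture. For further analysis, we also used a system similar to the Shi et al. neural network system with a greedy decoder and observed a 23.27% improvement in character error rate over the base model.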
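
Character error rate (CER) and word error rate (WER) are commonly computed as an edit distance divided by the reference length. The sketch below follows that common definition; it is an assumption that the paper uses the same normalization, and the example strings are hypothetical.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(ref, hyp):
    """Character error rate: character edits per reference character."""
    return edit_distance(ref, hyp) / len(ref)


def wer(ref, hyp):
    """Word error rate: word edits per reference word."""
    ref_words = ref.split()
    return edit_distance(ref_words, hyp.split()) / len(ref_words)


# Hypothetical ground truth and recognizer output.
truth = "the quick brown fox"
output = "the quick brwn fox"

print(f"CER: {cer(truth, output):.4f}")
print(f"WER: {wer(truth, output):.4f}")
```

A single missing character costs one edit out of 19 characters but one whole word out of four, which is why WER is typically much higher than CER on the same output.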

A Lexicon and Depth-wise Separable Convolution Based Handwritten Text Recognition System

Lalita Kumari , Sukhdeep Singh , VVS Rathore , Anuj Sharma

Category: Computer Vision

2022-07-11

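Cursive handwritten text recognition is a challenging research problem in the field of pattern recognition. Current state-of-the-art approaches include models based on convolutional recurrent neural networks and multi-dimensional long short-term memory recurrent neural networks. These methods are computationally expensive, and the models are complex at the design level. In recent studies, models combining convolutional neural networks and gated convolutional neural networks have needed fewer parameters than models based on convolutional recurrent neural networks. To reduce the total number of parameters to be trained, in this work we use depthwise convolution in place of standard convolution, combined with a gated convolutional neural network and a bidirectional gated recurrent unit. In addition, we include a lexicon-based word beam search decoder at the testing step, which also helps improve the overall accuracy of the model. We obtain a 3.84% character error rate and a 9.40% word error rate on the IAM dataset, and a 14.56% word error rate on the George Washington dataset.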
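
The parameter saving from replacing a standard convolution with a depthwise separable one can be checked with simple arithmetic. The sketch below assumes the usual depthwise-plus-pointwise factorization and ignores bias terms; the layer sizes are hypothetical and are not taken from the paper.

```python
def standard_conv_params(kernel, c_in, c_out):
    """Weights in a standard 2D convolution (bias ignored)."""
    return kernel * kernel * c_in * c_out


def depthwise_separable_params(kernel, c_in, c_out):
    """Weights in a depthwise conv followed by a 1x1 pointwise conv."""
    depthwise = kernel * kernel * c_in  # one k x k filter per input channel
    pointwise = c_in * c_out            # 1x1 conv that mixes channels
    return depthwise + pointwise


# Hypothetical layer: 3x3 kernel, 64 input channels, 128 output channels.
k, c_in, c_out = 3, 64, 128
standard = standard_conv_params(k, c_in, c_out)
separable = depthwise_separable_params(k, c_in, c_out)

print(f"standard:  {standard}")
print(f"separable: {separable}")
print(f"reduction: {standard / separable:.2f}x")
```

For this hypothetical layer the separable form needs roughly an eighth of the weights, and the ratio grows with kernel size and output channel count.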

Impact of Channel Variation on One-Class Learning for Spoof Detection

Rohit Arora , Anmol Arora , Rohit Singh Rathore

Category: Machine Learning

2021-09-30

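Margin-based losses, especially one-class classification losses, have improved the generalization ability of countermeasure (CM) systems, but their reliability has not been tested on spoofing attacks degraded by channel variation. Our experiments address this in two ways: first, by investigating the impact of various codec simulations and their corresponding parameters, namely bit rate, discontinuous transmission (DTX), and loss, on the performance of a one-class classification based CM system; and second, by testing the efficacy of various margin-based loss settings when training and evaluating our CM system on codec-simulated data. Multi-conditional training (MCT), along with various data-feeding and custom mini-batching strategies, was also explored to handle the added variability in the new data setting and to find the optimal setup for the experiments above. Our experimental results show that a strict constraint on the embedding space degrades the performance of the one-class classification model. MCT yields a relative performance improvement of 35.55%, and custom mini-batching captures more generalized features for the new data setting. Varying the codec parameters, meanwhile, had a significant impact on the performance of the countermeasure system.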

Argoverse 2: Next Generation Datasets for Self-Driving Perception and Forecasting

Benjamin Wilson , William Qi , Tanmay Agarwal , John Lambert , Jagjeet Singh , Siddhesh Khandelwal , Bowen Pan , Ratnesh Kumar , Andrew Hartnett , Jhony Kaesemodel Pontes

Categories: Computer Vision | Artificial Intelligence | Machine Learning | Robotics

2023-01-02

We introduce Argoverse 2 (AV2) - a collection of three datasets for perception and forecasting research in the self-driving domain. The annotated Sensor Dataset contains 1,000 sequences of multimodal data, encompassing high-resolution imagery from seven ring cameras, and two stereo cameras in addition to lidar point clouds, and 6-DOF map-aligned pose. Sequences contain 3D cuboid annotations for 26 object categories, all of which are sufficiently-sampled to support training and evaluation of 3D perception models. The Lidar Dataset contains 20,000 sequences of unlabeled lidar point clouds and map-aligned pose. This dataset is the largest ever collection of lidar sensor data and supports self-supervised learning and the emerging task of point cloud forecasting. Finally, the Motion Forecasting Dataset contains 250,000 scenarios mined for interesting and challenging interactions between the autonomous vehicle and other actors in each local scene. Models are tasked with the prediction of future motion for "scored actors" in each scenario and are provided with track histories that capture object location, heading, velocity, and category. In all three datasets, each scenario contains its own HD Map with 3D lane and crosswalk geometry - sourced from data captured in six distinct cities. We believe these datasets will support new and existing machine learning research problems in ways that existing datasets do not. All datasets are released under the CC BY-NC-SA 4.0 license.

Application Of ADNN For Background Subtraction In Smart Surveillance System

Piyush Batra , Gagan Raj Singh , Neeraj Goyal

Categories: Computer Vision | Artificial Intelligence

2022-12-31

Object movement identification is one of the most researched problems in the field of computer vision. In this task, we try to classify each pixel as foreground or background. Although numerous traditional machine learning and deep learning methods already exist for this problem, most of them share two major issues: the need for large amounts of ground truth data and inferior performance on unseen videos. Since every pixel of every frame has to be labeled, acquiring large amounts of data for these techniques is rather expensive. Recently, Zhao et al. [1] proposed a one-of-a-kind Arithmetic Distribution Neural Network (ADNN) for universal background subtraction, which utilizes probability information from the histogram of temporal pixels and achieves promising results. Building on this work, we developed an intelligent video surveillance system that uses the ADNN architecture for motion detection, trims the video to the parts containing motion, and performs anomaly detection on the trimmed video.

Active Learning for Neural Machine Translation

Neeraj Vashistha , Kriti Singh , Ramakant Shakya

Categories: Natural Language Processing | Artificial Intelligence

2022-12-30

The machine translation mechanism translates texts automatically between different natural languages, and Neural Machine Translation (NMT) has gained attention for its rational context analysis and fluent translation accuracy. However, processing low-resource languages that lack relevant training attributes like supervised data is a current challenge for Natural Language Processing (NLP). We incorporated a technique known as Active Learning with the NMT toolkit Joey NMT to reach sufficient accuracy and robust predictions for low-resource language translation. With active learning, a semi-supervised machine learning strategy, the training algorithm determines which unlabeled data would be the most beneficial to label, using selected query techniques. We implemented two model-driven acquisition functions for selecting the samples to be validated. This work uses transformer-based NMT systems: a baseline model (BM), a fully trained model (FTM), an active learning least confidence based model (ALLCM), and an active learning margin sampling based model (ALMSM), all translating English to Hindi. The Bilingual Evaluation Understudy (BLEU) metric has been used to evaluate system results. The BLEU scores of the BM, FTM, ALLCM, and ALMSM systems are 16.26, 22.56, 24.54, and 24.20, respectively. The findings in this paper demonstrate that active learning techniques help the model converge early and improve the overall quality of the translation system.
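
The two acquisition functions named in this abstract, least confidence and margin sampling, can be sketched in their textbook form. This is a simplified illustration over per-sample probability vectors, not the paper's Joey NMT implementation; the pool, the sample names, and the `select` helper are all hypothetical.

```python
def least_confidence(probs):
    """Uncertainty as 1 minus the probability of the top prediction."""
    return 1.0 - max(probs)


def margin(probs):
    """Gap between the two most probable predictions (small = uncertain)."""
    top1, top2 = sorted(probs, reverse=True)[:2]
    return top1 - top2


def select(pool, k, strategy):
    """Return the k sample names the chosen strategy would send for labeling."""
    if strategy == "least_confidence":
        ranked = sorted(pool, key=lambda n: least_confidence(pool[n]), reverse=True)
    elif strategy == "margin":
        ranked = sorted(pool, key=lambda n: margin(pool[n]))
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return ranked[:k]


# Hypothetical model output distributions for three unlabeled samples.
pool = {
    "s1": [0.90, 0.05, 0.05],
    "s2": [0.40, 0.35, 0.25],
    "s3": [0.50, 0.49, 0.01],
}

print(select(pool, 1, "least_confidence"))
print(select(pool, 1, "margin"))
```

On this toy pool the two strategies pick different samples: least confidence favors the flattest distribution, while margin sampling favors the closest race between the top two candidates.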

POMRL: No-Regret Learning-to-Plan with Increasing Horizons

Khimya Khetarpal , Claire Vernade , Brendan O'Donoghue , Satinder Singh , Tom Zahavy

Categories: Artificial Intelligence | Machine Learning

2022-12-30

We study the problem of planning under model uncertainty in an online meta-reinforcement learning (RL) setting where an agent is presented with a sequence of related tasks with limited interactions per task. The agent can use its experience in each task and across tasks to estimate both the transition model and the distribution over tasks. We propose an algorithm to meta-learn the underlying structure across tasks, utilize it to plan in each task, and upper-bound the regret of the planning loss. Our bound suggests that the average regret over tasks decreases as the number of tasks increases and as the tasks become more similar. In the classical single-task setting, it is known that the planning horizon should depend on the estimated model's accuracy, that is, on the number of samples within a task. We generalize this finding to meta-RL and study the dependence of planning horizons on the number of tasks. Based on our theoretical findings, we derive heuristics for selecting slowly increasing discount factors, and we validate their significance empirically.

DeepCuts: Single-Shot Interpretability based Pruning for BERT

Jasdeep Singh Grover , Bhavesh Gawri , Ruskin Raj Manku

Categories: Natural Language Processing | Artificial Intelligence | Computer Vision | Machine Learning

2022-12-27

As language models have grown in parameters and layers, it has become much harder to train and infer with them on single GPUs. This is severely restricting the availability of large language models such as GPT-3, BERT-Large, and many others. A common technique to solve this problem is pruning the network architecture by removing transformer heads, fully-connected weights, and other modules. The main challenge is to discern the important parameters from the less important ones. Our goal is to find strong metrics for identifying such parameters. We thus propose two strategies: Cam-Cut based on the GradCAM interpretations, and Smooth-Cut based on the SmoothGrad, for calculating the importance scores. Through this work, we show that our scoring functions are able to assign more relevant task-based scores to the network parameters, and thus both our pruning approaches significantly outperform the standard weight and gradient-based strategies, especially at higher compression ratios in BERT-based models. We also analyze our pruning masks and find them to be significantly different from the ones obtained using standard metrics.

Higher order organizational features can distinguish protein interaction networks of disease classes: a case study of neoplasms and neurological diseases

Vikram Singh , Vikram Singh

Category: Machine Learning

2022-12-26

Neoplasms (NPs) and neurological diseases and disorders (NDDs) are amongst the major classes of diseases underlying deaths of a disproportionate number of people worldwide. To determine whether distinctive features exist in the local wiring patterns of protein interactions emerging at the onset of a disease belonging to either of these two classes, we examined 112 and 175 protein interaction networks belonging to NPs and NDDs, respectively. Orbit usage profiles (OUPs) for each of these networks were enumerated by investigating the networks' local topology. 56 non-redundant OUPs (nrOUPs) were derived and used as network features for classification between these two disease classes. Four machine learning classifiers, namely k-nearest neighbour (KNN), support vector machine (SVM), deep neural network (DNN), and random forest (RF), were trained on these data. DNN obtained the greatest average AUPRC (0.988) among these classifiers. DNNs developed on node2vec and the proposed nrOUPs embeddings were compared using 5-fold cross validation on the basis of the average values of six performance measures, viz., AUPRC, Accuracy, Sensitivity, Specificity, Precision, and MCC. The nrOUPs-based classifier performed better on all six performance measures.
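
Five of the six performance measures listed in this abstract follow directly from a binary confusion matrix (AUPRC needs ranked scores and is omitted here). The sketch below uses the standard definitions; the counts are hypothetical and are not results from the paper.

```python
import math


def confusion_metrics(tp, tn, fp, fn):
    """Standard binary classification measures from confusion-matrix counts."""
    mcc_denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),   # true positive rate (recall)
        "specificity": tn / (tn + fp),   # true negative rate
        "precision": tp / (tp + fp),
        # Matthews correlation coefficient, defined as 0 when the denominator is 0.
        "mcc": (tp * tn - fp * fn) / mcc_denominator if mcc_denominator else 0.0,
    }


# Hypothetical counts for a two-class problem (illustrative only).
metrics = confusion_metrics(tp=90, tn=80, fp=20, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

MCC uses all four cells of the confusion matrix, so it stays informative when the two classes are imbalanced, as they are with 112 versus 175 networks.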
